To simplify the sum in the above equation, we separate it into two parts: $k \neq j$ and $k = j$. We also note that

$$(N - e_j)_k = N_k, \qquad k \neq j$$

and

$$(N - e_j)_k = N_k - 1, \qquad k = j.$$

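We then obtain

$$L_m(N - e_j) = \sum_{\substack{k=1 \\ k \neq j}}^{K} N_k \left[ \frac{L_{mk}(N)}{N_k} + D_{mkj}(N) \right] + (N_j - 1) \left[ \frac{L_{mj}(N)}{N_j} + D_{mjj}(N) \right]. \qquad (7)$$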

Simple algebraic manipulation results in the following form of the above equation:

$$L_m(N - e_j) = L_m(N) - \frac{L_{mj}(N)}{N_j} + D'_{mj}(N) - D_{mjj}(N) \qquad (8)$$

where $D'_{mj}(N) = \sum_{k=1}^{K} N_k D_{mkj}(N)$.
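The equivalence of the term-by-term sum and the reorganized form can be checked numerically. The following Python sketch assumes the Core estimate has the form $L_m(N - e_j) = \sum_k (N - e_j)_k [L_{mk}(N)/N_k + D_{mkj}(N)]$; the function names, the nested-list data layout, and the random test values are illustrative assumptions, not part of the original algorithm description.

```python
import random


def direct_estimate(L, D, N, m, j):
    """Sum over all classes k explicitly: O(K) work per (m, j)."""
    total = 0.0
    for k in range(len(N)):
        pop_k = N[k] - 1 if k == j else N[k]  # (N - e_j)_k
        total += pop_k * (L[m][k] / N[k] + D[m][k][j])
    return total


def precompute_d_prime(D, N):
    """D'[m][j] = sum_k N[k] * D[m][k][j], computed once at O(M K^2) cost."""
    K = len(N)
    return [[sum(N[k] * D_m[k][j] for k in range(K)) for j in range(K)]
            for D_m in D]


def fast_estimate(L, L_total, D, D_prime, N, m, j):
    """Reorganized form: O(1) work per (m, j) once D' is available."""
    return L_total[m] - L[m][j] / N[j] + D_prime[m][j] - D[m][j][j]


def max_discrepancy(M=3, K=4, seed=0):
    """Largest gap between the two forms on random (hypothetical) data."""
    rng = random.Random(seed)
    N = [rng.randint(1, 9) for _ in range(K)]
    L = [[rng.uniform(0.1, 5.0) for _ in range(K)] for _ in range(M)]
    D = [[[rng.uniform(-0.1, 0.1) for _ in range(K)] for _ in range(K)]
         for _ in range(M)]
    L_total = [sum(row) for row in L]          # L_m(N) = sum_k L_mk(N)
    D_prime = precompute_d_prime(D, N)
    return max(
        abs(direct_estimate(L, D, N, m, j)
            - fast_estimate(L, L_total, D, D_prime, N, m, j))
        for m in range(M) for j in range(K)
    )
```

Calling `max_discrepancy()` for a few seeds should return values at the level of floating-point rounding, which is what the algebra predicts.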

Using the same development as above when $M = N - e_c$, we have

$$L_m(M - e_j) = L_m(M) - \frac{L_{mj}(M)}{(M)_j} + D'_{mj}(M) - D_{mjj}(M) - D_{mcj}(M), \qquad (M)_j > 0. \qquad (9)$$

Assuming that the values for the $D'_{mj}(M)$ $\forall m, j$, for $M = N$ and $M = N - e_c$, are available a priori, the cost of the Core algorithm is easily seen to be $O(MK)$.

Now consider the computation of the $D'_{mj}(M)$ $\forall m, j$ in the context of the Linearizer algorithm. In each top-level iteration of Linearizer, the Core algorithm is called once for population $M = N$ and once for each population $M = (N - e_c)$, $c = 1, 2, \ldots, K$. If each of these calls to the Core algorithm had to recompute the $D'_{mj}$ for its population vector, then each Core algorithm call would indeed cost $O(MK^2)$. However, in Linearizer

$$D_{mkj}(N) = D_{mkj}(N - e_c) \qquad \forall m, k, j, c \qquad (10)$$

and thus

$$D'_{mj}(N) = D'_{mj}(N - e_c) \qquad \forall m, k, j, c. \qquad (11)$$

Therefore, we can precompute $D'_{mj}(N)$ $\forall m, j$ at a cost of $O(MK^2)$ and use these values for each of the $K + 1$ calls to the Core algorithm. It is simple to see that the cost of Linearizer is then $O(MK^2)$.

In summary, the following modifications are made to the original Linearizer algorithm.

1) In Steps 1 and 5 of the Linearizer algorithm, compute $D'_{mj}(N)$ $\forall m, j$ prior to all other computations and store for use in the calls to the Core algorithm during Step 2 and Step 3.

2) Step 2 of the Core algorithm is replaced by a computation of $L_m(M - e_j)$ $\forall m, j$ using (8) if $M = N$ or (9) if $M = N - e_c$, with the precomputed values for $D'_{mj}(N)$ $(= D'_{mj}(N - e_c))$ and $D_{mkj}(N)$ $(= D_{mkj}(N - e_c))$.

IV.  Conclusion 

We have shown how Linearizer can be reorganized to reduce the computational cost to $O(MK^2)$. This is accomplished without altering the algorithm in any way that affects the results and thus preserves the empirical evidence of the accuracy of the method.

It is tempting to consider the reduction of the space requirements of the Linearizer to $O(MK)$ [from $O(MK^2)$] since we need only values for $D'_{mj}(N)$ and $D_{mjj}(N)$. However, each call to the Core algorithm for population $(N - e_j)$ requires the previous estimates for $L_{mk}(N - e_j)$. Thus, it does not appear possible to reduce the order of magnitude space requirements for Linearizer without surgery that would materially alter the algorithm.
On the Equivalence of Cost Functions in the Design of Circuits
by Costtable

JON  T.  BUTLER  and  KRISS  A.  SCHUELLER 

Abstract— In  the  costtable  approach  to  logic  design,  a  function  is 
realized  as  a  combination  of  functions  from  a  table.  The  objective  of 
the  synthesis  is  to  find  the  least  cost  realization,  where  realization  cost 
is  the  sum  of  the  costs  of  the  functions  used,  plus  the  cost  of  combining 
them. 

The  costs  of  costtable  functions  are  defined  by  a  cost  function,  which 
represents  chip  area,  speed,  power  dissipation,  or  a  combination  of 
these  factors.  We  show  that  there  is  an  arbitrarily  large  set  S  of  cost 
functions  all  of  which  yield  the  same  minimal  realization  from  a  given 
costtable.  This  implies,  for  example,  that  every  minimal  realization  of 
any  function  over  a  cost  function  in  S  is  independent  of  the  actual  cost 
function used. Furthermore, we show that, with any cost function, if the cost of combining functions from a costtable F is sufficiently large, the realizations behave as if the cost function belongs to S. That is, any minimal realization of a function f, using costtable F, is one of the minimal realizations of f using F and a cost function in S. Our interpretation of these results is that there are not as many distinct costtables as originally thought.

Index  Terms—  Cost  function,  costtable,  logic  design,  minimization, 
multiple-valued logic, synthesis.

I.  Introduction 

In the costtable approach to the design of logic circuits [1]-[7], a given function is realized by selecting functions from a table and combining them. Associated with each chosen function is a cost. In a sense, all design is done this way. For example, programs are formed from a table of instructions with the cost being instruction execution time. In logic design, the table usually consists of functions which are easily designed in the technology used, and the cost can be chip area, power dissipation, speed, etc.

The  cost  of  a  realization  is  the  sum  of  the  costs  of  the  component 
functions plus the cost of combining them. Typically, there is more than one way to realize a given function, and the goal of the design is to find a minimal cost realization. Kerkhoff and Robroek [2] and Robroek [5] introduce the costtable technique for the synthesis of four-valued unary functions implemented in CCD (charge-coupled devices). Their proposed table contains 45 functions, from which all
256  unary  functions  are  synthesized.  The  cost  of  each  function  is  an 
approximation  to  the  chip  area  occupied  by  a  CCD  realization  of  that 
function,  and  the  synthesis  technique  used  is  exhaustive  search.  Lee 
and Butler [3], [4] show a costtable of 24 entries that produces realizations as good as or better than those in [2] and [5]. The proposed synthesis algorithm is still a search; however, nonproductive combinations are eliminated by using the transition count of the function to guide the search. In general, the choice of a costtable is determined
by  the  total  cost  of  the  realizations  produced;  for  a  given  costtable 
size,  one  wants  a  costtable  that  yields  the  lowest  total  cost.  Schueller, 
Tirumalai,  and  Butler  [6]  show  minimal  and  near-minimal  costtables 
of  all  sizes,  and  from  this,  find  that  the  costtables  of  [2]  and  [3]  are 
not  minimal.  Also,  it  is  observed  that  there  is  a  point  of  diminishing 
returns  with  respect  to  costtable  size.  That  is,  while  costtables  of 
larger  size  produce  more  economical  realizations,  beyond  a  certain 
size,  about  10%  of  the  total  number  of  functions  to  be  synthesized, 
there  is  little  benefit  to  adding  more  functions  to  the  costtable.  The 
analysis  in  [6]  is  done  for  five  different  costs,  and  it  is  found  that  the 
point  of  diminishing  returns  is  approximately  the  same  for  all  costs. 

Schueller and Butler [7] show that the “average” costtable is significantly less efficient than the optimal one for small costtables, but very close to the optimal one for large costtables. Since a randomly
chosen  costtable  is  likely  to  be  much  worse  than  optimal  for  small 
costtable  sizes,  effective  algorithms  for  finding  minimal  costtables 
are  important  for  this  case.  This  applies  to  all  practical  applications 
of  costtables,  since  the  number  of  entries  will  be  small  compared  to 
the  universe  of  functions  to  be  realized.  In  general,  it  is  not  easy  to 
find a minimal costtable. However, for the special case of costtables of size one larger than the smallest costtable, a minimal costtable is shown [7]. In addition, it is shown that a search for minimal costtables cannot exclude certain seemingly useless functions, called composite functions, that are most efficiently realized by summing other functions.

In  this  paper,  we  show  that  the  minimal  realization  of  functions  by 
costtables  is  relatively  unaffected  by  changes  in  cost  functions  or  the 
cost  of  combining  functions  (sum,  in  our  case).  Costtable  realizations 
are  more  robust  than  previously  suspected.  Specifically,  we  show 
that,  for  any  function,  all  minimal  realizations  under  the  linear  cost 
function  are  independent  of  the  specific  linear  cost  function  used 
(of  which  there  are  infinitely  many).  We  show  that,  for  general  cost 
functions,  if  the  cost  of  combining  costtable  functions  is  sufficiently 
large,  there  is  a  minimal  realization  of  any  function  that  is  identical  to 
a  minimal  realization  of  that  function  using  the  linear  cost  function. 
We conclude from these results that an understanding of the linear cost function is important to the understanding of the costtable synthesis technique.

II.  Notation  and  Introductory  Concepts 

Let $R = \{0, 1, \ldots, r - 1\}$ be a set of r logic values, where $r \ge 2$, and let $X = \{x_1, x_2, \ldots, x_n\}$ be a set of n variables, where $x_i$ takes on values from R. A function $f(X)$ is a mapping $f: R^n \to R$. An assignment of values to variables in X is represented by a vector v. The value of $f(X)$ for that assignment is $f(v)$. It will be convenient to represent a function by the tuple $\langle a_0, a_1, \ldots, a_{r^n-1} \rangle$, where $a_i$ is the value of f for an assignment of values to v which, interpreted as a base r number, has value i. For example, a four-valued unary function $f(x)$ is represented as the 4-tuple $\langle a_0, a_1, a_2, a_3 \rangle$, where $a_i = f(i)$, for $0 \le i \le 3$. Let $U_{n,r}$ be the set of all n-variable r-valued functions.


Let $c(f)$, the cost of function f, be a mapping $c: U_{n,r} \to \mathbb{R}$, where $\mathbb{R}$ is the set of real numbers. For example, the cost function $c(f)$ used in [2] and [5] correlates closely with the chip area occupied by the most compact implementation of f.

The connecting operation is ordinary vector addition. That is, if f is realized as the sum $f = f_1 + f_2 + \cdots + f_m$, each component of f is the sum of the corresponding components in $f_1, f_2, \ldots, f_m$. If any component sum of the set of functions exceeds $r - 1$, the highest logic value, the sum is undefined. Let s be the cost of realizing the vector sum of two functions. Thus, the cost of the realization $f = f_1 + f_2 + \cdots + f_m$ is $c(f_1) + c(f_2) + \cdots + c(f_m) + (m - 1)s$, where the last term is the cost of $(m - 1)$ two-input adders. The two-tuple $(c, s)$ is called a cost function/sum pair.

f is a basis function if f is 1 for exactly one component and is 0 otherwise. Let BT be the set of all basis functions plus the function with all 0 components. BT is called the basis costtable. F is a costtable if $BT \subseteq F \subseteq U_{n,r}$. $c_F(f)$, the cost of realizing $f \in U_{n,r}$ with respect to costtable F, is

$$c_F(f) = \min \{ c(f_1) + c(f_2) + \cdots + c(f_m) + (m - 1)s \}$$

where $f = f_1 + f_2 + \cdots + f_m$ and where c is a cost function. The total cost $T(F)$ of costtable F is

$$T(F) = \sum_{f \in U_{n,r}} c_F(f).$$

$F_t$ is a minimal costtable of size t if $T(F_t) \le T(F)$, for all F such that $|F| = t$.
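The definitions above can be made concrete with a small brute-force sketch in Python for unary four-valued functions. The costtable entries, their costs, the sum cost, and the helper names below are hypothetical values chosen for illustration; exhaustive enumeration is used only because the example is tiny.

```python
from itertools import combinations_with_replacement


def vector_sum(funcs, r):
    """Componentwise sum; None if any component exceeds r - 1 (undefined)."""
    total = tuple(sum(column) for column in zip(*funcs))
    return total if max(total) <= r - 1 else None


def realization_cost(funcs, cost, s):
    """c(f_1) + ... + c(f_m) + (m - 1) * s."""
    return sum(cost[f] for f in funcs) + (len(funcs) - 1) * s


def min_realization(target, cost, s, r, max_terms):
    """Cheapest realization of target using at most max_terms table entries."""
    best = None
    for m in range(1, max_terms + 1):
        for combo in combinations_with_replacement(list(cost), m):
            if vector_sum(combo, r) == target:
                c = realization_cost(combo, cost, s)
                if best is None or c < best[0]:
                    best = (c, combo)
    return best


# Hypothetical costtable: the basis costtable plus one extra function.
COST = {
    (0, 0, 0, 0): 0,
    (1, 0, 0, 0): 1,
    (0, 1, 0, 0): 1,
    (0, 0, 1, 0): 1,
    (0, 0, 0, 1): 1,
    (0, 0, 1, 1): 2,
}

best_cost, best_parts = min_realization((0, 0, 1, 2), COST, s=1, r=4, max_terms=3)
```

Here the two-term realization built from the extra table entry beats the three-term realization from basis functions alone, because it saves one adder.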

III. Strong Equivalence Between Cost Function/Sum Pairs

We show in this section that there are whole classes of cost function/sum pairs in which the realization of functions by costtables is independent of the particular cost function/sum pair.

Definition: Let c and d be cost functions and s and t be the corresponding costs of combining functions, respectively. Then, cost function/sum pair $(c, s)$ is strongly equivalent to $(d, t)$ iff for any costtable F, and any pair of functions $f_1, f_2 \in U_{n,r}$, $c_F(f_1) < c_F(f_2)$ iff $d_F(f_1) < d_F(f_2)$.

Strong equivalence preserves the relative costs of implementations among functions realized by a costtable. Thus, if two cost function/sum pairs are strongly equivalent, a minimal realization of a function under one pair is a minimal realization under the other. Since strong equivalence is an equivalence relation, it divides the set of all cost function/sum pairs into equivalence classes.

Theorem 1: Every equivalence class induced by strong equivalence contains a cost function/sum pair $(c, s)$, where $s = 0$.

Proof: We show that every cost function/sum pair is strongly equivalent to a cost function/sum pair where the sum cost is 0. Let $(d, t)$ be an arbitrary cost function/sum pair in some class C, and consider pair $(c, s)$, where $c(f) = d(f) + t$, for all $f \in U_{n,r}$, and where $s = 0$. We show that $(c, s)$ is also in C. Given an arbitrary costtable F, let a minimum realization of function f in F with respect to $(d, t)$ be $f = f_1 + f_2 + \cdots + f_m$ with cost

$$d_F(f) = d(f_1) + d(f_2) + \cdots + d(f_m) + (m - 1)t$$

where $f_i \in F$. Since a minimum realization of f with respect to cost function/sum pair $(c, s)$ costs no more than the realization $f = f_1 + f_2 + \cdots + f_m$, $c_F(f) \le d_F(f) + t$. Let a minimum realization of f in F with respect to $(c, s)$ be $f = g_1 + g_2 + \cdots + g_p$ with cost

$$c_F(f) = c(g_1) + c(g_2) + \cdots + c(g_p)$$

where $g_i \in F$. Since a minimum realization of f with respect to cost function/sum pair $(d, t)$ costs no more than the realization $f = g_1 + g_2 + \cdots + g_p$, $d_F(f) + t \le c_F(f)$. Thus, $c_F(f) = d_F(f) + t$. Since $c_F(f)$ and $d_F(f)$ differ only by a constant, $(d, t)$ and $(c, s)$ are strongly equivalent. Q.E.D.

It follows from Theorem 1 that there is no difference in the realizations produced by a costtable where the cost of combining functions is included in the cost function. That is, if the cost of combining two costtable functions is s and the cost function is $c(f)$, then the minimal realization of any function f is the same as in the case where there is 0 cost in combining two costtable functions and the cost function is $c(f) + s$. Furthermore, given any cost function/sum pair where the sum is 0, there is an arbitrarily large number of cost function/sum pairs which are strongly equivalent to it. In such classes, the cost functions differ by a constant. Next, we show that strong equivalence extends over classes where the cost functions are differentiated by more than a constant.

Definition: Given $f = \langle a_0, a_1, \ldots, a_{r^n-1} \rangle$, the linear cost of f is

$$LC(\langle a_0, a_1, \ldots, a_{r^n-1} \rangle) = k_0 a_0 + k_1 a_1 + \cdots + k_{r^n-1} a_{r^n-1} + k.$$

For example, consider unary four-valued functions. If $k_i = 1$ and $k = 0$, $LC(\langle 1122 \rangle) = 6$ and $LC(\langle 2031 \rangle) = 6$, and the linear cost is identical to the sum cost discussed in [7]. If $k_i = k = 0$, $LC(\langle 1122 \rangle) = 0$ and $LC(\langle 2031 \rangle) = 0$, and the linear cost is identical to the constant 0 cost discussed in [7].

Theorem 2: All linear cost functions with $s + k > 0$ belong to a single equivalence class induced by strong equivalence, where s is the cost of the sum operation and k is the constant part of the linear cost function.

Proof: Given a costtable F, let a minimal realization of $f \in U_{n,r}$ be

$$f = f_1 + f_2 + \cdots + f_m$$

where $f_i \in F$, which is achieved at cost

$$c_F(f) = c(f_1) + c(f_2) + \cdots + c(f_m) + (m - 1)s$$

where c is a linear cost function. The first m terms on the r.h.s. sum to $c(f) + (m - 1)k$, where k is the constant term in the linear cost function. Thus,

$$c_F(f) = c(f) + (m - 1)(s + k).$$

Since $s + k > 0$, it follows that this minimal realization of f is achieved by summing the least number of functions from F (smallest m). It follows that any linear cost function/sum pair, where $s + k > 0$, is strongly equivalent to any other such cost function/sum pair. Q.E.D.
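The step in the proof from the sum of component costs to $c(f) + (m - 1)k$ can be verified on a small case. The Python sketch below uses unit coefficients ($k_i = 1$) for unary four-valued functions; the particular decomposition of the tuple 1122, the constants k = 2 and s = 3, and the function name are assumptions made for illustration.

```python
def linear_cost(f, coeffs, k):
    """LC(f) = k_0*a_0 + k_1*a_1 + ... + k."""
    return sum(k_i * a_i for k_i, a_i in zip(coeffs, f)) + k


coeffs = (1, 1, 1, 1)

# With k = 0, the tuples 1122 and 2031 have the same linear cost.
lc_1122 = linear_cost((1, 1, 2, 2), coeffs, 0)
lc_2031 = linear_cost((2, 0, 3, 1), coeffs, 0)

# One possible decomposition: 1122 = 1111 + 0011, so m = 2.
parts = [(1, 1, 1, 1), (0, 0, 1, 1)]
f = tuple(sum(column) for column in zip(*parts))
k, s, m = 2, 3, len(parts)

# Left side: component costs plus adders. Right side: the closed form.
lhs = sum(linear_cost(p, coeffs, k) for p in parts) + (m - 1) * s
rhs = linear_cost(f, coeffs, k) + (m - 1) * (s + k)
```

Because the right-hand side depends on the realization only through m, any linear cost function with $s + k > 0$ ranks realizations of the same function purely by the number of terms.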

Theorem 2 shows that there is no difference in the minimal realizations of functions from a costtable between linear cost functions
such  as  the  sum  cost  and  the  constant  0  cost  discussed  in  [7].  Thus, 
an  algorithm  for  finding  a  minimal  realization  of  a  function  or  for 
finding  a  minimal  costtable  applies  to  all  cost  functions. 

IV.  Weak  Equivalence  Between  Cost  Function/Sum  Pairs 

While strong equivalence exists over the restricted class of linear cost functions, we show in this section that any cost function is related to this class of functions if the cost of the sum operation is large enough.

Definition: Let c and d be cost functions and s and t be the corresponding costs of combining functions, respectively. Then, cost function/sum pair $(c, s)$ is weakly equivalent to $(d, t)$ iff for any costtable F and any function $f \in U_{n,r}$, there is a minimal realization of f using $(c, s)$ that is identical to a minimal realization of f using $(d, t)$.

Like strong equivalence, weak equivalence preserves the relative costs of implementations among functions realized by a costtable. However, unlike strong equivalence, it is not necessary that all minimal realizations of any function under one cost function/sum pair be minimal realizations under the other, only that one such minimal realization exists. Since weak equivalence is an equivalence relation, it divides the set of all cost function/sum pairs into equivalence classes.

Lemma 1: For any cost function c, cost function/sum pair $(c, s)$ is weakly equivalent to $(LC, t)$, where LC is a linear cost function, for sufficiently large s.

Proof: When s is sufficiently large, the least cost realization of any function $f \in U_{n,r}$ is a realization requiring the fewest costtable functions. Thus, a minimal realization using cost function/sum pair $(c, s)$ is a minimal realization using cost function/sum pair $(LC, t)$, and so the two cost function/sum pairs are weakly equivalent. The minimal realization of a function f using cost function/sum pair $(c, s)$ is the lowest cost realization using the minimal number of costtable functions. Q.E.D.

The smallest value of s for which the observation is true is that value which guarantees that there are no realizations of f with more than the minimal number of costtable functions with lower cost than one with the minimal number of costtable functions. As an example of these ideas, consider the area cost function AC [7] and the following costtable


      Function    Cost
  1)  ⟨0001⟩        4
  2)  ⟨0010⟩       10
  3)  ⟨0100⟩       10
  4)  ⟨1000⟩        7
  5)  ⟨0033⟩       13
  6)  ⟨1111⟩        1
  7)  ⟨3300⟩       20.

The least cost realization of ⟨3333⟩ using three costtable functions is ⟨1111⟩ + ⟨1111⟩ + ⟨1111⟩ at a cost of $3 + 2s$. Using two, the minimal number of costtable functions, the least linear cost realization is ⟨0033⟩ + ⟨3300⟩ at a cost of $33 + s$. Thus, if the latter is to be a minimal realization, $s \ge 33 - 3 = 30$. Thus, 30 is a lower bound on the value of s such that the area cost function/sum pair, $(AC, s)$, is weakly equivalent to $(LC, t)$. This is considerably higher than the sum cost 2 used with the area cost function [2]-[7].
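The trade-off in this example can be reproduced by enumerating realizations once and then re-pricing them for different sum costs. The Python sketch below assumes that function 6 of the costtable is the all-ones tuple 1111 with cost 1, as the three-term realization implies; the function names and the limit of four terms on the search are illustrative assumptions.

```python
from itertools import combinations_with_replacement

# Area costs from the example costtable (function 6 taken to be 1111).
AREA_COST = {
    (0, 0, 0, 1): 4,
    (0, 0, 1, 0): 10,
    (0, 1, 0, 0): 10,
    (1, 0, 0, 0): 7,
    (0, 0, 3, 3): 13,
    (1, 1, 1, 1): 1,
    (3, 3, 0, 0): 20,
}


def realizations(target, cost, max_terms):
    """Yield (sum of function costs, m) for every way to sum to target."""
    for m in range(1, max_terms + 1):
        for combo in combinations_with_replacement(list(cost), m):
            if tuple(sum(col) for col in zip(*combo)) == target:
                yield sum(cost[f] for f in combo), m


def best_for(s, candidates):
    """Minimal total cost for sum cost s; ties go to the fewest terms."""
    return min((base + (m - 1) * s, m) for base, m in candidates)


CANDIDATES = list(realizations((3, 3, 3, 3), AREA_COST, max_terms=4))
```

Under these assumptions, `best_for(2, CANDIDATES)` selects the three-term realization, while for sum costs of 30 and above the two-term realization is at least as cheap.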

V.  Concluding  Remarks 

The linear cost function is fundamentally important in the costtable approach to the design of logic circuits. We have shown that the realizations by costtable of any function are the same for any choice of a linear cost function/sum pair such that $s + k > 0$. Furthermore, we have shown that the costtable realizations of an arbitrary cost function/sum pair are identical to those produced by a cost function/sum pair where the sum cost is 0.

We have shown that a weaker relationship exists between cost function/sum pairs considering just changes in the sum, the cost of combining functions. That is, any cost function/sum pair is weakly equivalent to the linear cost function/sum pair, in the sense that at least one minimal realization of any function is the same, for a sufficiently large sum cost.
Experimental Results on Subgoal Reordering

XUMIN  NIE  and  DAVID  A.  PLAISTED 

Abstract—  Subgoal  reordering  is  the  problem  of  determining  the  order 
for  solving  a  set  of  subgoals  of  a  goal,  in  order  to  improve  the  efficiency 
of  some  search  process.  In  this  paper,  we  report  our  investigations 
into  the  effect  of  subgoal  reordering  on  the  performance  of  a  goal 
oriented  theorem  prover,  when  some  simple  syntactic  heuristics  are  used 
to  perform  subgoal  reordering.  We  show  that  subgoal  reordering  using 
these  simple  heuristics  has  a  considerable  impact  on  the  performance  of 
the  prover  on  a  large  set  of  test  problems.  Some  heuristics  even  provide 
equally  good,  and  often  better,  performance  in  comparison  to  the  hand 
ordering  of  the  input  clauses.  The  merit  of  our  approach  seems  to  be 
that  we  are  considering  the  syntactic  aspect  of  theorem  proving.  This 
aspect  is  simple  in  form,  cheap  in  its  evaluation,  and  often  provides 
good  heuristics,  as  has  been  demonstrated  by  our  results. 

Index Terms— Depth-first iterative deepening search, heuristics, problem reduction format, subgoal reordering, theorem proving.


I.  Introduction 

Goal oriented theorem proving systems have some distinctive advantages over some systems based on resolution [1]. In a goal oriented system, a goal is expressed in terms of subgoals and the solutions for a goal are composed of the solutions for its subgoals. One of the advantages of these systems is that it is easy to incorporate heuristic considerations into them. One important such consideration
is,  for  example,  to  detect  unachievable  goals  by  a  semantics  test  [7], 
[2],  [10].  Another  heuristic  consideration  is  to  choose  the  order  in 
which  the  subgoals  of  a  goal  are  solved,  since  the  order  in  which  the 
subgoals  are  solved,  on  one  hand,  often  does  not  affect  the  solvability 
of  a  goal  and,  on  the  other  hand,  can  have  a  large  effect  on  the 
efficiency  of  solving  the  goal.  In  this  paper,  we  will  discuss  our 
research  on  this  aspect  of  heuristic  consideration  in  a  goal  oriented 
theorem  prover. 

We will define the terminology first. A term is a well-formed expression composed of variables and function symbols. An atom is an expression of the form $P(t_1, \ldots, t_n)$, where $t_1, \ldots, t_n$ are terms and P is a predicate symbol. A literal is an atom or an atom preceded by a negation sign $\neg$. A literal is positive if it is an atom, negative if it is an atom preceded by $\neg$. A clause is a disjunction of literals. A Horn-like clause is of the form $L$ :- $L_1, L_2, \ldots, L_n$, which represents the clause $L \lor \neg L_1 \lor \neg L_2 \lor \cdots \lor \neg L_n$, where L is called the head literal and $L_1, \ldots, L_n$ constitute the clause body. A general clause C is converted into a Horn-like clause HC as follows. One of the positive literals in C is chosen as the head literal of HC and all other literals in C are negated and put in the clause body of HC. If C contains only negative literals, we use the special literal FALSE as the head literal of HC. A clause only containing negative literals is called a goal clause.
A. Modified Problem Reduction Format

The  theorem  prover  we  use  is  an  implementation  of  the  modified 
problem  reduction  format  [6].  We  will  present  the  system  briefly 
here  to  illustrate  the  structure  of  the  inference  system.  The  modified 
problem  reduction  format  accepts  Horn-like  clauses  as  input.  It  has 
an  inference  rule  per  input  clause  plus  the  assumption  axioms  and 
the case analysis rule. To be specific, assume S is a set of Horn-like clauses. We obtain a set of inference rules from S for the modified problem reduction format as follows. For each Horn-like clause $L$ :- $L_1, L_2, \ldots, L_n$ in S, we have a clause rule. We call the $\Gamma$'s on the left of the arrow $\to$ the assumption list.

Clause Rules:

$$\frac{[\Gamma_0 \to L_1 \Rightarrow \Gamma_1 \to L_1],\ [\Gamma_1 \to L_2 \Rightarrow \Gamma_2 \to L_2],\ \ldots,\ [\Gamma_{n-1} \to L_n \Rightarrow \Gamma_n \to L_n]}{\Gamma_0 \to L \Rightarrow \Gamma_n \to L}$$

We  also  have  the  assumption  axioms  and  the  case  analysis  rule. 

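Assumption Axioms:

$$\Gamma \to L \Rightarrow \Gamma \to L \qquad \text{if } L \in \Gamma,\ L \text{ is a literal.}$$

$$\Gamma \to \neg L \Rightarrow \Gamma, \neg L \to \neg L \qquad L \text{ is positive.}$$

Case Analysis Rule:

$$\frac{[\Gamma_0 \to L \Rightarrow \Gamma_1, \neg M \to L],\ [\Gamma_1, M \to L \Rightarrow \Gamma_1, M \to L],\ |\Gamma_0| \le |\Gamma_1|}{\Gamma_0 \to L \Rightarrow \Gamma_1 \to L}$$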

The goal-subgoal structure of the modified problem reduction format is evident when it is used in a back chaining manner. Each clause rule, thus each input clause $L$ :- $L_1, L_2, \ldots, L_n$, can be regarded as a decomposition of a goal L into a set of subgoals $L_1, L_2, \ldots, L_n$. The assumption lists are introduced for guaranteeing the completeness of the system and are not of concern to our discussion.

II.  Subgoal  Reordering 

As we have stated, an input clause $L$ :- $L_1, L_2, \ldots, L_n$ decomposes a goal L into a set of subgoals $L_1, L_2, \ldots, L_n$. Furthermore, the subgoals $L_1, L_2, \ldots, L_n$ can be attempted in any order. Our work is based on this observation. We formally state the fact that the subgoals of an input clause can be attempted in any order as follows:

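Theorem: The modified problem reduction format is still sound and complete if, for a Horn-like clause $L$ :- $L_1, L_2, \ldots, L_n$, the clause rule is

$$\frac{\mathit{permute}([L_1, \ldots, L_n], [M_1, \ldots, M_n]),\ [\Gamma_0 \to M_1 \Rightarrow \Gamma_1 \to M_1],\ [\Gamma_1 \to M_2 \Rightarrow \Gamma_2 \to M_2],\ \ldots,\ [\Gamma_{n-1} \to M_n \Rightarrow \Gamma_n \to M_n]}{\Gamma_0 \to L \Rightarrow \Gamma_n \to L}$$

where $\mathit{permute}([L_1, \ldots, L_n], [M_1, \ldots, M_n])$ produces an arbitrary permutation $M_1, \ldots, M_n$ of $L_1, \ldots, L_n$ each time it is called.

Proof: We note that the soundness and completeness proofs of the modified problem reduction format in [6] do not assume any particular order of the literals in the clause body each time a clause rule is used. We can conclude that the order produced by permute is correct.

□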

The theorem implies that we can order the subgoals in a clause rule during the proof process, when the rule is invoked. The theorem prover will require less user guidance if it orders the subgoals automatically during the proof. In the case of logic programming, for instance, the user usually has a very good idea about what the order of the subgoals should be. Thus, ordering subgoals may not be relevant. But in theorem proving, a user may not have the knowledge to specify a good order in the input. Ordering subgoals automatically can provide a partial answer to this problem. The problem is how to order the subgoals. We call the process of determining the order of the subgoals in a clause rule during the proof process subgoal reordering. There are a couple of issues involved and we will discuss each of them.

The first issue is how to measure the quality of an ordering. It is hard to give a precise quantitative answer in general. We can roughly say that an ordering is good if it can make the search more efficient. To be specific, we can order the subgoals so that the most important subgoal is attempted first or so as to reduce the branching factors of the search space. To this end, we have defined some evaluation functions which measure the “quality” or “importance” of the subgoals and used the values of the evaluation functions to select the ordering. This raises the question of what the evaluation functions should measure. One requirement for the evaluation functions is that applying them incurs low overhead, since they will be applied to the subgoals frequently if the subgoals are ordered during the proof. We have considered several evaluation functions.

F 1  To  evaluate  the  size  of  the  subgoals  where  the  size  of  a  sub¬ 
goal  is  the  number  of  occurrences  of  predicate  symbols,  func¬ 
tion  symbols,  and  variables. 

F2  To  evaluate  the  mass  of  the  subgoals.  Given  a  set  of  clauses 
S,  the  mass  of  a  symbol  T  (predicate  or  function  symbol), 
denoted  by  mass(T),  is  defined  to  be 


preserved  if  the  evaluation  functions  assign  the  same  value  to  the 
subgoals. 

HI  Subgoal  having  largest  size  first .  This  heuristic  is  based 
on  several  considerations.  1)  A  larger  subgoal  usually  has  a 
smaller  branching  factor  since  the  larger  size  imposes  more 
constraints  on  unification.  2)  A  larger  subgoal  has  a  more 
complex  structure.  This  can  be  regarded  as  containing  more 
information,  thus  being  more  important.  3)  In  our  prover, 
the  solution  size  contributes  to  the  cost  of  solving  a  subgoal. 
Attempting  the  larger  subgoal  first  can  make  the  potentially 
unsuccessful  search  path  stop  earlier  since  larger  subgoals 
will  use  larger  solutions,  thus  contributing  more  to  the  cost. 

H2 Subgoal having the biggest mass first. This heuristic is
used in [10] for the level-subgoal reordering in their prover
based on hierarchical deduction. The subgoal with the largest
mass is likely to contain nonvariable symbols which occur
less frequently or to contain more nonvariable symbols.
Nonvariable symbols occurring less frequently are more likely
to be the symbols in the theorem or the Skolem function symbols.
Thus, the subgoal with the largest mass can be regarded as being
the most important. Also, a subgoal with large mass is likely to
have a small branching factor.

H3  Subgoal  with  the  least  number  of  solutions  with  the  same 
predicate  symbol  as  the  subgoal  first.  The  number  of  solu- 


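$$\mathrm{mass}(T) = \frac{\text{number of literals in } S}{\text{number of occurrences of } T \text{ in } S}.$$

For a term $t$,

$$\mathrm{mass}(t) =
\begin{cases}
\mathrm{mass}(C) & \text{if } t \text{ is a predicate or function symbol } C \\
0 & \text{if } t \text{ is a variable} \\
\mathrm{mass}(s) & \text{if } t = \neg s \\
\mathrm{mass}(f) + \mathrm{mass}(t_1) + \cdots + \mathrm{mass}(t_n) & \text{if } t = f(t_1, t_2, \ldots, t_n).
\end{cases}$$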

F3  The  number  of  solutions  with  the  same  predicate  symbol 
as  a  subgoal. 
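As an illustration, the size and mass measures might be computed as in the following Python sketch. The tuple encoding of terms and the helper names `size`, `symbol_counts`, and `mass` are assumptions made here, not part of the prover; a negated literal is assumed to have the size and mass of its atom, and every symbol of a subgoal is assumed to occur in S.

```python
from collections import Counter

# Hypothetical encoding (not from the paper): a variable is a str; any other
# term or atom is a tuple (symbol, arg1, ..., argn); a negated literal is
# ("not", atom).

def is_var(t):
    return isinstance(t, str)

def strip_not(t):
    return t[1] if t[0] == "not" else t

def size(t):
    """F1: occurrences of predicate symbols, function symbols, and variables."""
    if is_var(t):
        return 1
    t = strip_not(t)
    return 1 + sum(size(a) for a in t[1:])

def symbol_counts(clauses):
    """Count symbol occurrences and literals in a clause set S."""
    counts = Counter()

    def walk(t):
        if is_var(t):
            return
        t = strip_not(t)
        counts[t[0]] += 1
        for a in t[1:]:
            walk(a)

    n_literals = 0
    for clause in clauses:
        for literal in clause:
            n_literals += 1
            walk(literal)
    return counts, n_literals

def mass(t, counts, n_literals):
    """F2: rarer symbols weigh more; variables weigh nothing."""
    if is_var(t):
        return 0.0
    t = strip_not(t)
    own = n_literals / counts[t[0]]
    return own + sum(mass(a, counts, n_literals) for a in t[1:])

# S = { p(a) v ~q(X),  q(f(a)) }
S = [[("p", ("a",)), ("not", ("q", "X"))], [("q", ("f", ("a",)))]]
counts, n = symbol_counts(S)
goal = ("q", ("f", ("a",)))
print(size(goal), mass(goal, counts, n))  # 3 6.0
```

Both measures need only one pass over the subgoal, which is in line with the low-overhead requirement stated for the evaluation functions.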

The second issue concerns what algorithm to use to select the ordering
for a set of subgoals. Given $n$ subgoals, there are $n!$ possible
orderings. An exhaustive search would be too costly and probably
not worthwhile, due to the limited quality of the evaluation functions.
We have used a greedy algorithm instead. For a subgoal $\Gamma_0 \to L$, when
a clause rule corresponding to the input clause $L \mathrel{:-} L_1, L_2, \ldots, L_n$ is
used, the algorithm will be called to determine an ordering among
$L_1, L_2, \ldots, L_n$. The algorithm first applies the evaluation function to
each of $L_1, L_2, \ldots, L_n$, then sorts them according to their evaluation
function values. The resulting order among $L_1, L_2, \ldots, L_n$ will be
the order in which they will be attempted. We call this static reordering
because an ordering among the subgoals will be determined prior
to any attempt to solve any subgoal, when a clause rule is invoked. A
slight variation of the algorithm leads to dynamic reordering. In
dynamic reordering, no order will be determined prior to attempting
any subgoal. Rather, each time a subgoal is to be attempted, a subgoal
will be selected from the remaining subgoals. To be specific,
for any goal $\Gamma_0 \to L$, whenever a clause rule corresponding to the
input clause $L \mathrel{:-} L_1, L_2, \ldots, L_n$ is used, a subgoal $L_1'$ among the $n$
subgoals $L_1, L_2, \ldots, L_n$ will be selected and $\Gamma_0 \to L_1'$ attempted.
After $\Gamma_0 \to L_1'$ returns with $\Gamma_1 \to L_1'$, another subgoal $L_2'$ will be
selected among the remaining $n - 1$ subgoals, etc.
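The difference between the two schemes can be sketched in Python as follows. This is a toy model under stated assumptions, not the prover's implementation: `evaluate` and `solve` are hypothetical callbacks, a higher score means "attempt earlier" (for a least-first heuristic the score would be negated), and backtracking is omitted.

```python
def solve_static(subgoals, evaluate, solve, bound):
    """Static reordering: score every subgoal once, sort, then attempt in order."""
    ordered = sorted(subgoals, key=lambda g: evaluate(g, bound), reverse=True)
    for goal in ordered:
        bound = solve(goal, bound)
        if bound is None:  # the subgoal failed
            return None
    return bound


def solve_dynamic(subgoals, evaluate, solve, bound):
    """Dynamic reordering: re-score the remaining subgoals before each attempt."""
    remaining = list(subgoals)
    while remaining:
        goal = max(remaining, key=lambda g: evaluate(g, bound))
        remaining.remove(goal)
        bound = solve(goal, bound)
        if bound is None:
            return None
    return bound


# Toy stand-ins (hypothetical): a goal is (name, variables); its score is the
# number of its variables already bound; solving a goal binds all of them.
def evaluate(goal, bound):
    return sum(v in bound for v in goal[1])


def make_solver(trace):
    def solve(goal, bound):
        trace.append(goal[0])
        return bound | set(goal[1])
    return solve


goals = [("p", ("X",)), ("q", ("X", "Y")), ("r", ("Y", "Z", "W"))]
static_trace, dynamic_trace = [], []
solve_static(goals, evaluate, make_solver(static_trace), {"W"})
solve_dynamic(goals, evaluate, make_solver(dynamic_trace), {"W"})
print(static_trace)   # ['r', 'p', 'q']
print(dynamic_trace)  # ['r', 'q', 'p']
```

In the toy run the two schemes diverge after the first subgoal, because solving `r` binds `Y` and so raises the score of `q`. Both `sorted` and `max` keep the input order among equal scores.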

Dynamic reordering can adjust the order based on the progress
of the search, such as new variable bindings and newly derived
solutions. A problem may arise from the overhead of repeatedly
applying the evaluation function. If there are $n$ subgoals, the cost of
performing static reordering would be $O(n \log n)$ and the cost of
performing dynamic reordering would be $O(n^2)$ for our algorithm.
For short clauses, this would not make a big difference. This seems
to be the case for most of our test problems.

We have studied three heuristics for performing subgoal reordering.
We note that the order of the subgoals in the input will be


tions  with  the  same  predicate  symbol  as  a  subgoal  does  give 
a  bound  on  the  branching  factor  for  this  subgoal.  In  case  of 
a  tie,  the  subgoal  with  the  biggest  mass  will  be  first. 
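The tie-breaking rule of H3 amounts to a two-part sort key, as the following Python sketch suggests. The literal encoding `(predicate, argument, ...)`, the function name `order_by_h3`, and the toy mass values are assumptions for illustration only.

```python
from collections import Counter


def order_by_h3(subgoals, solutions, mass):
    """Fewest same-predicate solutions first; ties go to the larger mass."""
    per_predicate = Counter(sol[0] for sol in solutions)
    return sorted(subgoals, key=lambda g: (per_predicate[g[0]], -mass(g)))


# Hypothetical data: derived solutions and made-up mass values per predicate.
solutions = [("p", "a"), ("p", "b"), ("q", "a"), ("r", "c")]
subgoals = [("p", "X"), ("q", "X"), ("r", "Y")]
toy_mass = {"p": 1.0, "q": 2.0, "r": 5.0}

ordered = order_by_h3(subgoals, solutions, lambda g: toy_mass[g[0]])
print(ordered)  # [('r', 'Y'), ('q', 'X'), ('p', 'X')]
```

Here `q` and `r` each have one solution, so the heavier `r` goes first, and `p`, with two solutions, goes last.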

III.  Related  Work 

Similar problems are considered in some other goal-oriented theorem
proving systems [10], [5]. In [10], level goal reordering is
performed during the proof process, where the search process is
controlled by suitable selection of the first literal to resolve upon in a
goal clause.¹ Its heuristic is to select the literal with the biggest mass
or with the most complex structure. In SLR-based proof procedures,
the choice of the literal can be made dynamically for the application
of the extension operation [5]. One heuristic suggested is to select
the literal which can be resolved upon with the least number of input
chains.

The problem we consider here is similar in nature to the conjunctive
problem in [8], which discusses the problem of ordering a
conjunction (a set of propositions which share variables and must be
satisfied simultaneously) in order to reduce the size of the search.
The authors use the size of the database to estimate the cost of solving a
conjunct and determine an ordering of conjuncts which has the least
cost by possibly searching through $n!$ possible orderings for $n$
conjuncts. An adjacency theorem is proven to cut down the size of the
search, and some heuristics are also suggested to avoid the search
completely. While the basic problem is the same, some assumptions
in [8] are not valid in our case. For example, the assumption that
all solutions to a conjunct are directly available in the database is
not valid. This assumption makes it possible to estimate the cost of
solving a conjunct rather easily. In our case, however, the solutions
to a subgoal are rarely directly available and may require many
inferences to obtain; we do not know how many inferences would

1 Here the term goal clause does not refer to an all-negative clause. See
[10].
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TABLE  I 


Subgoal Reordering Using H1

               No Reordering       Dynamic Reordering      Static Reordering
Theorem      seconds  inferences    seconds  inferences    seconds  inferences
dbabhp          6.15         151       6.50         167       6.82           ?
fex4t1        590.18        2858    1635.57        3755    1637.50           ?
fex4t2        126.02        1015     126.07        1015     132.00           ?
fex5          231.78        2970     268.83        3710     269.07           ?
ls108        1557.38        8380      29.37         575      29.25           ?
ls65          164.20        2957     140.72        2703     146.07           ?
schubert      748.45        8921     649.52       11014     761.42           ?
wos1           64.22        1059      47.32         922      53.12           ?
wos10         223.05        3950     200.65        3615     238.52        3950
wos11         235.53        4222     224.12        4070     247.23        4166
wos15        6045.23       20556      62.38        1403      66.80        1458
wos31        7742.22       29783   21079.78       78273   21162.65       78274

TABLE II

Subgoal Reordering Using H2

               No Reordering       Dynamic Reordering      Static Reordering
Theorem      seconds  inferences    seconds  inferences    seconds  inferences
dbabhp          6.15         151      40.82         598      41.55         598
fex4t1        590.18        2858    1670.18        3755    1681.22        3755
fex4t2        126.02        1015     126.98        1015     129.27        1015
fex5          231.78        2970     259.57        3644     266.15        3632
ls108        1557.38        8380      29.43         575      30.37         575
ls65          164.20        2957     100.37        1786     102.78        1786
schubert      748.45        8921    2306.73           ?     695.25           ?
wos1           64.22        1059     681.03        5665     696.50           ?
wos10         223.05        3950     136.68        2282     138.40        2322
wos11         235.53        4222     239.52        3732     245.40        3905
wos15        6045.23       20556      62.08        1339      61.57        1382
wos31        7742.22       29783    5816.27           ?    5987.40       28136

A ? marks an entry that is not legible in the source. The No Reordering
columns are the same baseline in Tables I, II, and III, so baseline values
that are illegible in one table (fex5 inferences, wos10 seconds, and wos15
seconds in Table I; wos15 inferences in Table II) are taken from Table III.

be required. Also, the cost of solving the same subgoal may vary if
caching is performed, where caching implies that a subgoal need not
be solved more than once. All of this makes realistically estimating the
cost of solving a subgoal very difficult. It is also pointed out in [8]
that, in addition to the difficulty of estimating cost, an optimal ordering
of the conjuncts cannot always be achieved by considering only the
subgoals of a single goal if inferences are required to obtain the solutions.
This implies that a global data structure is needed to store all the
unsolved subgoals, and that the optimal ordering must be selected from
all the possible orderings of those unsolved subgoals.

In  our  work,  instead  of  estimating  the  cost  of  solving  a  subgoal, 
we  quantify  certain  syntactic  characteristics  of  the  subgoals  and  use 
a  cheap  greedy  algorithm  to  determine  the  ordering.  We  only  deal 
with  subgoals  belonging  to  one  goal  to  make  the  subgoal  reordering 
process  compatible  with  the  depth-first  iterative  deepening  search  [4] 
used  in  the  prover.  The  major  advantages  of  the  depth-first  iterative 
deepening  search  are  that  it  is  complete  and  requires  little  memory. 
If  the  best-first  search  strategy  were  used,  which  requires  a  lot  of 
memory,  subgoal  reordering  would  not  be  necessary. 

IV.  Experimental  Results 

A convenient Prolog interface in the prover provides an easy vehicle
for carrying out subgoal reordering. In the input to the prover, a
subgoal of the form prolog(L) represents a call to the Prolog procedure
L. We write a Prolog subroutine, called best_subgoal, to order
a list of subgoals according to the evaluation function. Another Prolog
subroutine is written to translate the standard input format into
the format which includes the calls to the Prolog subroutine
best_subgoal. For example, the input clause L :- L1, L2, L3 is translated
into the clause

    L :- prolog(best_subgoal([L1, L2, L3], [X1|Y])), X1,
         prolog(best_subgoal(Y, [X2, X3])), X2, X3.

to perform dynamic reordering; it is translated into the clause

    L :- prolog(best_subgoal([L1, L2, L3], [X1, X2, X3])),
         X1, X2, X3.

to perform static reordering. The resulting clause will be the input
to the prover.
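The translation step can be illustrated with a short Python sketch that emits the two clause forms as text. The function names, the spelling `best_subgoal`, the fresh variable names `X1, X2, ...` and `Y1, Y2, ...`, and the generalization of the dynamic form beyond three subgoals are assumptions made here; the paper's translator is itself a Prolog subroutine.

```python
def translate_static(head, body):
    """One best_subgoal call orders all subgoals before any is attempted."""
    xs = [f"X{i}" for i in range(1, len(body) + 1)]
    call = f"prolog(best_subgoal([{', '.join(body)}], [{', '.join(xs)}]))"
    return f"{head} :- {call}, {', '.join(xs)}."


def translate_dynamic(head, body):
    """A best_subgoal call precedes each attempt and passes on the rest."""
    n = len(body)
    if n < 2:
        raise ValueError("need at least two subgoals to reorder")
    parts = []
    source = "[" + ", ".join(body) + "]"
    for i in range(1, n - 1):
        rest = f"Y{i}"  # list of subgoals still to be ordered
        parts.append(f"prolog(best_subgoal({source}, [X{i}|{rest}]))")
        parts.append(f"X{i}")
        source = rest
    # With two subgoals left, choosing the next one also fixes the last.
    parts.append(f"prolog(best_subgoal({source}, [X{n - 1}, X{n}]))")
    parts += [f"X{n - 1}", f"X{n}"]
    return f"{head} :- {', '.join(parts)}."


print(translate_static("L", ["L1", "L2", "L3"]))
print(translate_dynamic("L", ["L1", "L2", "L3"]))
```

For a two-subgoal clause the two translations coincide, since a single selection already determines the whole order.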

We have performed tests on the problem set from [9] using the
three heuristics. We tested both static reordering and dynamic
reordering using each heuristic on 82 problems. We show part of our
experimental results in Tables I-III. We summarize the data in the
three tables S1, S2, and S3.² As we expected, no single heuristic,
when used for subgoal reordering, performs better on all the test
problems. Nevertheless, there are some interesting things revealed
by the data.

We first note that subgoal reordering incurs little overhead. This
is because the evaluation functions are easy to evaluate, the algorithm
for selecting the ordering is simple, and the input clauses in

2 The data were obtained on a SUN3/60 workstation with 12 megabytes of
memory. The Prolog system is the ALS Prolog Compiler (Version 0.60) from
Applied Logic Systems, Inc.


TABLE  III 

Subgoal  Reordering  Using  H3 


               No Reordering       Dynamic Reordering      Static Reordering
Theorem      seconds  inferences    seconds  inferences    seconds  inferences
dbabhp          6.15         151      41.78         598      41.72         598
fex4t1        590.18        2858     603.50        2901     612.58        2901
fex4t2        126.02        1015     146.33        1017     146.48        1017
fex5          231.78        2970     217.17        2712     215.95        2712
ls108        1557.38        8380      28.93         541      30.20         541
ls65          164.20        2957     101.82        1793     102.32        1793
wos1           64.22        1059    3501.13       12334    3457.18       12354
wos10         223.05        3950     176.95        2788     154.53        2666
wos11         235.53        4222     274.10        3938     266.88        4088
wos15        6045.23       20556    4109.68       14304    3953.87       14528
wos31        7742.22       29783    7141.10       28634    7090.12       28648
schubert      748.45        8921     435.52        9510     712.82       11263

S1: Average Data for Subgoal Reordering

                 Average time          Average inferences    Average inferences
Method           per theorem (s)       per theorem           per second
no reordering    232.05                1335.76               5.76
dynamic-H1       315.04                1648.72               5.23
static-H1        318.68                1649.38               5.18
dynamic-H2       154.44                1175.12               7.61
static-H2        168.82                1171.12               6.93
dynamic-H3       252.62                1361.39               5.38
static-H3        253.65                1407.85               5.55

S2: Problem Distribution in Running Time (in seconds)

Method           [?]   (10, 60]   [?]   [?]   (600, +∞)   Total
no reordering     52         16     8     1           5      82
dynamic-H1        51         21     7     0           3      82
static-H1         52         20     7     0           3      82
dynamic-H2        50         21     7     1           3      82
static-H2         50         20     6     1           5      82
dynamic-H3        50         20     6     1           5      82
static-H3         50         20     6     0           6      82

[?] marks an interval header that is not legible in the source.

S3: Comparing with No Reordering

                 Improvements³          Degenerations³         Even³
Method           number   average (%)   number   average (%)   number
dynamic-H1           33         15.73       20          53.4       29
static-H1            24          19.8       17          31.9       41
dynamic-H2           45          24.9       20          84.0       17
static-H2            38          24.6       24         108.0       20
dynamic-H3           45          22.5       20         164.3       17
static-H3            40          21.2       25         136.9       17

3 For example, the two numbers 33 and 15.73 under improvements for
dynamic-H1 indicate that the prover with dynamic subgoal reordering using
heuristic H1 does better on 33 of the 82 problems (takes fewer inferences)
and that the average speedup with respect to the prover without subgoal
reordering is 15.73%; the two numbers 20 and 53.4 under degenerations
for dynamic-H1 indicate that it does worse on 20 of the 82 problems (takes
more inferences) and that the average slowdown is 53.4%; the number 29
under even for dynamic-H1 indicates that it takes the same number of
inferences as the prover without subgoal reordering on 29 of the 82
problems.
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the  problems  are  generally  short  (7  literals  maximal).  For  the  same 
reasons, dynamic reordering is not more expensive than static
reordering. All of this can be seen from the data in S1 and S2. The data
in S3 suggest that, at least for our heuristics, dynamic reordering
should be preferred if subgoal reordering is to be performed at all,
since dynamic reordering does better on more problems than static
reordering using the same heuristics.

The data in S2 suggest that subgoal reordering does not affect the
performance of the prover very much. But the data in S1 seem to
suggest otherwise. This discrepancy results from the dramatic
improvement or degeneration of the performance of the prover when
performing subgoal reordering on several problems (ls108, wos15,
and wos31). These problems are difficult for the prover without
subgoal reordering. This suggests that subgoal reordering can be a
valuable addition to the prover for solving hard problems for which we
can devise specific heuristics.
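The effect of a few hard problems on the averages can be checked directly on the twelve theorems listed in Table I. The Python sketch below uses the No Reordering and Dynamic Reordering (H1) running times in seconds. It covers only these twelve problems, not the full set of 82, and the two baseline times that are illegible in Table I (wos10, wos15) are assumed to equal the baseline reported in Table III.

```python
# (no reordering seconds, dynamic-H1 seconds) for the listed theorems
times = {
    "dbabhp": (6.15, 6.50),
    "fex4t1": (590.18, 1635.57),
    "fex4t2": (126.02, 126.07),
    "fex5": (231.78, 268.83),
    "ls108": (1557.38, 29.37),
    "ls65": (164.20, 140.72),
    "schubert": (748.45, 649.52),
    "wos1": (64.22, 47.32),
    "wos10": (223.05, 200.65),
    "wos11": (235.53, 224.12),
    "wos15": (6045.23, 62.38),
    "wos31": (7742.22, 21079.78),
}

# Signed change in running time caused by reordering, per theorem.
change = {name: dyn - base for name, (base, dyn) in times.items()}

# Rank theorems by how much their running time moved, in either direction.
ranked = sorted(change, key=lambda name: abs(change[name]), reverse=True)
total_abs = sum(abs(c) for c in change.values())
top3 = ranked[:3]
share = sum(abs(change[name]) for name in top3) / total_abs

print(top3, round(share, 3))  # ['wos31', 'wos15', 'ls108'] 0.943
```

On this subset, the three problems named above account for roughly 94% of the total absolute change in running time, which is why the averages in S1 move so much while the distribution in S2 barely changes.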

One general heuristic does suggest itself. It seems that subgoals
with complex structures should be favored. The reasons are exactly
those behind H1 and H2. Subgoals with complex structures tend to
have small branching factors and can be seen as more important.
Special attention should be paid to function symbols since they
represent objects in the problem domain. The good performance of the
prover when performing subgoal reordering using H2 reinforces this
rather strongly.

V.  Conclusion 

Finding the optimal ordering for a set of subgoals requires
domain-dependent knowledge. When such knowledge is not available,
we have to resort to general heuristics. We have tested several such
heuristics and shown that they can have great impact, sometimes
adverse, on the performance of a prover. But some heuristics seem to
work better or equally well most of the time. Such heuristics are
useful since they can make the theorem prover more automatic. We
also point out that our heuristics are almost purely syntactic in nature.
Heuristics of this sort are simple in form and impose low overhead in
their evaluation, and they often provide performance improvements.
In general, we think that the importance of the syntactic aspect of
mechanical theorem proving is not to be ignored, although it may
not play a decisive role in the success of this field in the future.
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A  Parallel  Algorithm  for  Solving  Sparse  Triangular  Systems 

CHIN-WEN  HO  and  R.  C.  T.  LEE 

Abstract: In this paper, we propose a fast parallel algorithm, which
is generalized from the parallel algorithms for solving banded linear
systems, to solve sparse triangular systems. We transform the original
problem into a directed graph. The solving procedure then consists
of eliminating edges in this graph. The worst case time-complexity of
this parallel algorithm is $O(\log^2 n)$, where $n$ is the size of the
coefficient matrix. When the coefficient matrix is a triangular banded
matrix with bandwidth $m$, the time-complexity of our algorithm is
$O(\log m \log n)$.

Index Terms: CREW PRAM, cyclic reduction, directed graph model,
parallel computation, presubstitution, recursive doubling, sparse
triangular linear systems.

I.  Introduction 

Since many engineering and natural science problems can be
formulated as the problem of computing the solution of a linear system
of equations $Ax = b$, efficient algorithms to solve linear systems
have always been of interest to a large number of researchers. In
recent years, because of the availability of multiprocessor systems as
well as vector computers, an avalanche of papers on parallel
algorithms for linear systems has been published [1]-[3], [7], [9]-[17],
[19]-[22].

In this paper, we shall consider the linear system problem whose
coefficient matrix $A$ is sparse and triangular. We propose a parallel
algorithm whose worst case performance is $O(\log^2 n)$, where $n$ is
the size of matrix $A$. Thus, our algorithm is superior to that proposed
in [21].

This paper is organized as follows. Section I gives an introduction
to the problem. Section II introduces the parallel algorithm proposed
by Wing and Huang [21]. Our algorithm and its performance analysis
are given in Section III. Section IV gives concluding remarks.

II.  Previous  Results 

A triangular matrix $A = [a_{ij}]$, $1 \le i, j \le n$, is a matrix whose
nonzero elements occur only in the lower or upper triangle of the
matrix. A sparse triangular matrix is a triangular matrix with only
a few nonzero entries. Without loss of generality, we may assume
that nonzero elements occur only in the lower triangle and that the
diagonal elements are all 1. (If $a_{ii} \ne 1$ for some $i$, then we may
divide the $i$th row and $b_i$ by $a_{ii}$ without changing the solution values.)
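The normalization to a unit diagonal can be illustrated with a short Python sketch. It stores each row of a sparse lower triangular matrix as a dictionary from column index to value and then solves the normalized system by ordinary sequential forward substitution. The data layout, the function names, and the example system are assumptions made here, and the sequential solve only serves to check the normalization; it is not the parallel algorithm of this paper.

```python
def normalize(rows, b):
    """Divide row i and b[i] by a_ii so that every diagonal entry becomes 1."""
    new_rows, new_b = [], []
    for i, row in enumerate(rows):
        d = row[i]  # diagonal entry, assumed nonzero
        new_rows.append({j: v / d for j, v in row.items()})
        new_b.append(b[i] / d)
    return new_rows, new_b


def forward_substitute(rows, b):
    """Solve a unit lower triangular system, one unknown at a time."""
    x = []
    for i, row in enumerate(rows):
        s = sum(v * x[j] for j, v in row.items() if j < i)
        x.append(b[i] - s)  # the diagonal entry is 1 after normalization
    return x


# Hypothetical 3 x 3 system with a non-unit diagonal (0-based indices):
#   2*x0                 = 4
#   1*x0 + 4*x1          = 10
#        - 3*x1 + 0.5*x2 = -2
rows = [{0: 2.0}, {0: 1.0, 1: 4.0}, {1: -3.0, 2: 0.5}]
b = [4.0, 10.0, -2.0]

unit_rows, unit_b = normalize(rows, b)
x = forward_substitute(unit_rows, unit_b)
print(x)  # [2.0, 2.0, 8.0]
```

Scaling a row together with its right-hand side leaves the solution unchanged, so the result also satisfies the original system.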

Let  us  consider  the  following  sparse  triangular  system: 
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*2  =2 

-2  1 
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*3  =  2  +  'l.X  2 

-1  -3  1 

X4 

1 

X4  =  1  +  *1  +3*3 

_  -2  2  1  _ 

-*5- 

_1_ 

*5  =  1  +  2*i  —  2*3 

To solve this system, Wing and Huang [21] proposed constructing
a directed graph as in Fig. 1. In the diagram, each block
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